A sound source classification system based on subband processing
Abstract
This paper describes a classification system that recognizes the presence of sounds from different sources. The types of audio signal considered are speech, music, noise, and silence. Subband processing suited to each sound source is applied to characterize it. The algorithm classifies the contents of a given audio signal in four steps. The acoustical parameters and statistical measures used for classification are obtained through an off-line training procedure. The silence and onset detection stages label the start and end instants of the acoustical events present in the signal. Acoustical parameters are then extracted from the signal, and classification is carried out using linear discrimination with a common covariance matrix. Experiments are carried out on a database containing mixtures of human speech, musical instruments, background noise of different types, and silence. The experimental results show classification success rates of XX.X% for speech/music mixtures, XX.X% for speech/noise mixtures, XX.X% for different musical instruments, and XX.X% for mixtures containing speech, music, and noise.
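The final classification step named in the abstract, linear discrimination with a common covariance matrix, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `fit_lda` and `predict_lda`, the two-dimensional synthetic features, and the class centers are all assumptions made for illustration. In the described system the features would be subband-based acoustical parameters obtained from off-line training.

```python
import numpy as np

def fit_lda(X, y):
    """Estimate class means, priors, and one pooled (common) covariance matrix."""
    classes = np.unique(y)
    n, d = X.shape
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    priors = np.array([np.mean(y == c) for c in classes])
    pooled = np.zeros((d, d))
    for c, mu in zip(classes, means):
        diff = X[y == c] - mu
        pooled += diff.T @ diff          # within-class scatter
    pooled /= (n - len(classes))         # unbiased pooled covariance
    return classes, means, priors, pooled

def predict_lda(model, X):
    """Assign each row of X to the class with the largest linear discriminant score."""
    classes, means, priors, pooled = model
    W = np.linalg.solve(pooled, means.T)                  # columns: Sigma^-1 mu_k
    bias = -0.5 * np.sum(means.T * W, axis=0) + np.log(priors)
    scores = X @ W + bias                                 # shape (n_samples, n_classes)
    return classes[np.argmax(scores, axis=1)]

# Synthetic stand-in for extracted features (hypothetical values, not real audio data).
rng = np.random.default_rng(0)
labels = np.array(["speech", "music", "noise", "silence"])
centers = np.array([[0.0, 0.0], [6.0, 0.0], [0.0, 6.0], [6.0, 6.0]])
X = np.vstack([c + rng.normal(size=(50, 2)) for c in centers])
y = np.repeat(labels, 50)

model = fit_lda(X, y)
pred = predict_lda(model, X)
accuracy = np.mean(pred == y)
print(f"training accuracy on synthetic data: {accuracy:.3f}")
```

Because every class shares the same covariance estimate, the quadratic terms cancel between classes and the decision boundaries are linear in the features, which is what distinguishes this rule from a per-class (quadratic) Gaussian classifier.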
Similar resources
Classification of emotional speech using spectral pattern features
Speech Emotion Recognition (SER) is a new and challenging research area with a wide range of applications in man-machine interactions. The aim of a SER system is to recognize human emotion by analyzing the acoustics of speech sound. In this study, we propose Spectral Pattern features (SPs) and Harmonic Energy features (HEs) for emotion recognition. These features extracted from the spectrogram ...
Identification of Houseplants Using Neuro-vision Based Multi-stage Classification System
In this paper, we present a machine vision system that was developed on the basis of neural networks to identify twelve houseplants. Image processing system was used to extract 41 features of color, texture and shape from the images taken from front and back of the leaves. The features were fed into the neural network system as the recognition criteria and inputs. Multilayer perceptron (MLP) ne...
Separating three simultaneous speeches with two microphones by integrating auditory and visual processing
This paper addresses the problem of automatic recognition of three simultaneous speeches with two microphones, that is, that of sound source separation where the number of sound sources is greater than that of microphones. The approach used is the direction-pass filter, which is implemented by hypothetical reasoning on the interaural phase difference (IPD) and interaural intensity difference (I...
Image Classification via Sparse Representation and Subspace Alignment
Image representation is a crucial problem in image processing, where many low-level image representations exist, e.g., SIFT, HOG, and so on. But there is a missing link between low-level and high-level semantic representations. In fact, traditional machine learning approaches, e.g., non-negative matrix factorization, sparse representation and principal component analysis are employed to d...
Robust speech recognition in reverberant environments using subband-based steady-state monaural and binaural suppression
The precedence effect describes the ability of the auditory system to suppress the later-arriving components of sound in a reverberant environment, maintaining the perceived arrival azimuth of a sound in the direction of the actual source, even though later reverberant components may arrive from other directions. It is also widely believed that precedence-like processing can also improve speech...
Publication date: 2002